Revisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search
Abstract
The Smith-Waterman algorithm is a dynamic programming method for determining optimal local alignments between nucleotide or protein sequences. However, it suffers from quadratic time and space complexity. Many algorithmic and architectural enhancements have been proposed to address this problem, but at the cost of reduced sensitivity in the algorithms or significant expense in hardware, respectively. Hence, there is a need to evaluate the tradeoffs between the different solutions. This motivation, coupled with the lack of an evaluation metric to quantify these tradeoffs, leads us to formally define and quantify the sensitivity of homology search methods so that tradeoffs between sequence-search solutions can be evaluated quantitatively. As an example, although the BLAST algorithm executes significantly faster than Smith-Waterman, we find that BLAST misses 80% of the significant sequence alignments. This paper then presents a highly efficient parallelization of the Smith-Waterman algorithm on the Cell Broadband Engine, a novel hybrid multicore architecture that drives the PlayStation 3 (PS3) game console, and emulates BLAST by repeatedly executing the parallelized Smith-Waterman algorithm to search for a query in a given sequence database. Through an innovative mapping of the optimal Smith-Waterman algorithm onto a cluster of PlayStation 3 nodes, our implementation delivers a 10-fold speed-up over a high-end multicore architecture and an 88-fold speed-up over a non-accelerated PS3. Finally, we compare the performance of our implementation of the Smith-Waterman algorithm to that of BLAST and the canonical Smith-Waterman implementation, based on a combination of three factors: execution time (speed), sensitivity, and the actual cost of deploying each solution.
In the end, our parallelized Smith-Waterman algorithm approaches the speed of BLAST while maintaining ideal sensitivity and achieving low cost through the use of PlayStation 3 game consoles.
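To illustrate the quadratic cost mentioned in the abstract, the following is a minimal serial sketch of Smith-Waterman local-alignment scoring. It is not the authors' Cell Broadband Engine implementation. The function name `smith_waterman_score`, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration only.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b.

    Illustrative sketch: linear gap penalty and scoring values are assumed,
    not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Full (len(a)+1) x (len(b)+1) table: this is the quadratic space cost.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    # Every cell is visited once: this is the quadratic time cost.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Searching a database in this style means running the function once per database sequence against the query, which is why the per-pair cost dominates and why parallelizing the table fill is attractive.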
Similar resources
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Fast Image Database Search using Tree-Structured VQ
In this paper, we exploit the techniques of tree structured vector quantization (TSVQ), branch and bound search, and the triangle inequality to speed the search of large image databases. Our method can reduce search computation required to locate images which best match a query image provided by a user. While exact search is possible, a free parameter allows search accuracy to be reduced, there...
Effects of Project Uncertainties on Nonlinear Time-Cost Tradeoff Profile
This study presents the effects of project uncertainties on nonlinear time-cost tradeoff (TCT) profile of real life engineering projects by the fusion of fuzzy logic and artificial neural network (ANN) models with hybrid meta-heuristic (HMH) technique, abridged as Fuzzy-ANN-HMH. Nonlinear time-cost relationship of project activities is dealt with ANN models. ANN models are then integrated with ...
FPGA architecture for pairwise statistical significance estimation
Sequence comparison is one of the most fundamental computational problems in bioinformatics. Pairwise sequence alignment methods align two sequences using a substitution matrix consisting of pairwise scores of aligning different residues with each other (like BLOSUM62), and give an alignment score for the given sequence-pair. This work addresses the problem of accurately estimating statistica...
Publication date: 2008